Analysis of Pairwise Dependency Information Content for Representing and Searching for Transcription Factor Binding Sites
Authors
Abstract
Transcription factors are proteins that bind to specific segments of DNA to control gene expression. We present an improvement on supervised learning approaches for finding transcription factor binding sites. We consider binding sites of the same length for a single transcription factor and use the Berg and von Hippel scoring method. We add the pairwise information content of positional dependency to the existing pairwise scoring and show that it increases the accuracy of the scoring method. These results support the idea that bases within binding site sequences exhibit pairwise dependency.
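The abstract does not give the formula used for the pairwise information content. As a minimal sketch, assuming that the dependency between two positions is measured by the mutual information of their base distributions across aligned, equal-length sites, it could be computed as follows. The function names and the toy sequences are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(sites, i, j):
    """Mutual information (in bits) between columns i and j of aligned sites.

    Assumption: plain empirical frequencies, no pseudocounts or
    small-sample correction.
    """
    n = len(sites)
    count_i = Counter(s[i] for s in sites)
    count_j = Counter(s[j] for s in sites)
    count_ij = Counter((s[i], s[j]) for s in sites)
    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


def pairwise_information(sites):
    """Mutual information for every pair of positions in equal-length sites."""
    if not sites:
        raise ValueError("at least one binding site is required")
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all binding sites must have the same length")
    return {
        (i, j): mutual_information(sites, i, j)
        for i, j in combinations(range(length), 2)
    }


# Toy alignment (hypothetical data): positions 0 and 1 always agree,
# so they share one full bit of information.
toy_sites = ["AATT", "CCGG", "AATT", "CCGG"]
scores = pairwise_information(toy_sites)
```

A pair of positions with high mutual information is a candidate dependency. How such values are combined with the Berg and von Hippel score is specific to the paper and is not reproduced here.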
Similar resources
Comparative analysis of methods for representing and searching for transcription factor binding sites
MOTIVATION: An important step in unravelling the transcriptional regulatory network of an organism is to identify, for each transcription factor, all of its DNA binding sites. Several approaches are commonly used in searching for a transcription factor's binding sites, including consensus sequences and position-specific scoring matrices. In addition, methods that compute the average number of nu...
Modeling Transcription Factor Binding Sites with Supervised Learning
We present a supervised learning approach to transcription factor binding site modeling for four distinct species. Using the consensus scoring method, we look at binding sites of unequal length and the alignment strategy associated with these binding sites. Pairwise scoring and information content were added to the consensus scoring to further increase accuracy of transcription factor binding s...
Mapping of Transcription Factor Binding Region of Kappa Casein (CSN3) Gene in Iranian Bacterianus and Dromedaries Camels
κ-casein is a glycosylated protein in mammalian milk that plays an essential role in the milk micelles. Control of κ-casein expression reflects this essential role, although an understanding of the mechanisms involved lags behind that of the other milk protein genes. Transcriptional regulation, a first mechanism for controlling the development of organisms, is carried out by transcription facto...
Bioinformatic principles underlying the information content of transcription factor binding sites.
Empirically, it has been observed in several cases that the information content of transcription factor binding site sequences (R(sequence)) approximately equals the information content of binding site positions (R(frequency)). A general framework for formal models of transcription factors and binding sites is developed to address this issue. Measures for information content in transcription fa...